Add more metrics for reindex #137597

samxbr · 2025-11-04T23:57:07Z

Adds a new metric es.reindex.completion.total to track the number of completed reindex operations, along with these metric attributes to identify different results:

error.type: if present, indicates the reindex failed with the specified exception. Otherwise, the reindex was successful
reindex.source: local or remote, indicates whether the source cluster was the local or a remote cluster
- this attribute is also added to the existing es.reindex.duration.histogram metric
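To make the attribute semantics above concrete, here is a minimal, self-contained sketch. It is not the code from this PR and does not use Elasticsearch's telemetry API: the class `ReindexCompletionSketch`, its in-memory counter, and the choice of the exception's simple class name as the `error.type` value are all assumptions made for illustration. Only the metric name and the two attribute names come from the description.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical in-memory stand-in for a counter with attributes (not the Elasticsearch API). */
public class ReindexCompletionSketch {
    // Names taken from the PR description.
    static final String METRIC = "es.reindex.completion.total";
    static final String ATTR_ERROR_TYPE = "error.type";
    static final String ATTR_SOURCE = "reindex.source";

    // One running total per distinct attribute set.
    private final Map<Map<String, String>, LongAdder> counts = new ConcurrentHashMap<>();

    /** Builds the attribute set: reindex.source is always set, error.type only on failure. */
    static Map<String, String> attributes(boolean remote, Throwable failure) {
        Map<String, String> attrs = new HashMap<>();
        attrs.put(ATTR_SOURCE, remote ? "remote" : "local");
        if (failure != null) {
            // Assumption: the exception's simple class name is used as the error type.
            attrs.put(ATTR_ERROR_TYPE, failure.getClass().getSimpleName());
        }
        return Map.copyOf(attrs);
    }

    /** Increments the completion counter once per finished reindex, successful or not. */
    void recordCompletion(boolean remote, Throwable failure) {
        counts.computeIfAbsent(attributes(remote, failure), k -> new LongAdder()).increment();
    }

    /** Returns the total recorded for exactly this attribute set. */
    long count(Map<String, String> attrs) {
        LongAdder adder = counts.get(attrs);
        return adder == null ? 0 : adder.sum();
    }

    public static void main(String[] args) {
        ReindexCompletionSketch metrics = new ReindexCompletionSketch();
        metrics.recordCompletion(false, null);                              // local, successful
        metrics.recordCompletion(true, new IllegalStateException("boom"));  // remote, failed
        System.out.println(METRIC + " local successes: "
            + metrics.count(Map.of(ATTR_SOURCE, "local")));
    }
}
```

With this shape, a dashboard can treat the absence of `error.type` as success and split any total by `reindex.source`.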

...dex/src/internalClusterTest/java/org/elasticsearch/index/reindex/ReindexPluginMetricsIT.java

modules/reindex/src/main/java/org/elasticsearch/reindex/Reindexer.java

elasticsearchmachine · 2025-11-05T04:29:15Z

Pinging @elastic/es-data-management (Team:Data Management)

PeteGillinElastic

Thanks @samxbr. I haven't done a full review on this, since you said you were looking for early feedback, but here are some initial thoughts.

We also talked about attempting to get lost operations (due to node restart). Did you look at how the existing chart for that works? Is it grepping the logs? Do you know where that logging is done? I'm wondering whether we want to try to figure out how to distinguish remote vs local in there, too.

...dex/src/internalClusterTest/java/org/elasticsearch/index/reindex/ReindexPluginMetricsIT.java

modules/reindex/src/main/java/org/elasticsearch/reindex/Reindexer.java

modules/reindex/src/main/java/org/elasticsearch/reindex/ReindexMetrics.java

samxbr · 2025-11-06T02:03:51Z

Good question. I assume you are referring to the "Reindexing tasks failures logged during shutdown" graph. Judging from its search query, I think it is searching for this log:

        {
          "match_phrase": {
            "log.logger": "org.elasticsearch.node.ShutdownPrepareService"
          }
        },
        {
          "match_phrase": {
            "log.level": "WARN"
          }
        },
        {
          "match_phrase": {
            "message": "*reindex*"
          }
        }

We probably need to implement something else to capture remote/local there. I can take a deeper look at that later.

PeteGillinElastic · 2025-11-06T13:55:13Z

Yes, that's what I was referring to. Thanks.

PeteGillinElastic · 2025-11-06T13:58:40Z

Looking at the code, I don't think it's going to be straightforward to get the remote/local info in there. The code you linked is generic task-related code, and we are only identifying reindex tasks via a regex on the task name. I assume that changing the task name so that it indicates a remote source would be somewhat involved, and might be risky if other stuff depends on the naming convention (not sure whether it does).
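As a rough illustration of the limitation described above, the sketch below filters tasks purely by a regex on their names. The class `ReindexTaskFilterSketch`, the pattern, and the sample task names are all assumptions for illustration, not the actual shutdown code. The point is that a local and a remote reindex would carry the same name, so a name-based match cannot tell them apart.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical sketch of name-based task filtering (not the actual Elasticsearch code). */
public class ReindexTaskFilterSketch {
    // Assumption: any task whose name contains "reindex" counts as a reindex task.
    static final Pattern REINDEX_TASK = Pattern.compile(".*reindex.*");

    /** Returns only the task names that match the reindex pattern. */
    static List<String> reindexTasks(List<String> taskNames) {
        return taskNames.stream()
            .filter(name -> REINDEX_TASK.matcher(name).matches())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Illustrative task names; nothing in a name says whether the source is local or remote.
        List<String> running = List.of(
            "indices:data/write/reindex",
            "indices:data/write/bulk",
            "indices:data/read/search"
        );
        System.out.println(reindexTasks(running));
    }
}
```

Distinguishing the two cases here would need either a different task name or some extra information attached to the task, which is the change described above as involved and potentially risky.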

PeteGillinElastic

Thanks Sam! Just one substantive comment (and one nit).

modules/reindex/src/main/java/org/elasticsearch/reindex/ReindexMetrics.java

BASE=afd3a426eabdfda7d4fd6b0c52d76162e3c9c47e HEAD=26abb9d1597bc46b560996f1854ea01e858f061f Branch=main

Add reindex from remote metrics

f11a9c7

elasticsearchmachine added the v9.3.0 label Nov 4, 2025

Merge branch 'main' into reindex/add-metrics

290fa48

samxbr commented Nov 5, 2025

...dex/src/internalClusterTest/java/org/elasticsearch/index/reindex/ReindexPluginMetricsIT.java

modules/reindex/src/main/java/org/elasticsearch/reindex/Reindexer.java

Merge branch 'main' into reindex/add-metrics

6566a55

samxbr added :Data Management/Indices APIs APIs to create and manage indices and templates >non-issue labels Nov 5, 2025

samxbr marked this pull request as ready for review November 5, 2025 04:28

elasticsearchmachine added the Team:Data Management Meta label for data/management team label Nov 5, 2025

PeteGillinElastic reviewed Nov 5, 2025

samxbr added 2 commits November 5, 2025 11:13

Merge branch 'main' into reindex/add-metrics

70ed46d

Use attributes and add more tests

978f34c

PeteGillinElastic reviewed Nov 6, 2025

modules/reindex/src/main/java/org/elasticsearch/reindex/ReindexMetrics.java

modules/reindex/src/main/java/org/elasticsearch/reindex/ReindexMetrics.java (outdated)

samxbr added 2 commits November 6, 2025 12:40

fix constant

84afac9

Merge branch 'main' into reindex/add-metrics

26abb9d

phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Nov 7, 2025

Mirror upstream elastic#137597 as single snapshot commit for AI review

8c7858e

phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Nov 8, 2025

Mirror upstream elastic#137597 as single snapshot commit for AI review

9df479c

samxbr added 7 commits November 17, 2025 14:58

Merge branch 'main' into reindex/add-metrics

ec63e05

Use Counter

aa955f7

rename

83282af

Merge branch 'main' into reindex/add-metrics

0a4e911

Merge branch 'main' into reindex/add-metrics

be783e6

Merge branch 'main' into reindex/add-metrics

03e2018

fix metrics name

5e07f7c

samxbr requested a review from PeteGillinElastic November 19, 2025 17:29
